Storage Access Characteristics of Computational Science Applications
Abstract
Computational science applications are driving a demand for increasingly powerful storage systems. While many techniques are available for capturing the I/O behavior of individual application trial runs and specific components of the storage system, continuous characterization of a production system remains a daunting challenge for systems with hundreds of thousands of compute cores and multiple petabytes of storage. As a result, these storage systems are often designed without a clear understanding of the diverse computational science workloads they will support. In this study, we outline a holistic methodology for scalable, systemwide I/O characterization that combines storage device instrumentation and static file system analysis with a new mechanism for capturing detailed, application-level behavior. We demonstrate the effectiveness of our methodology by performing a multilevel, two-month study of Intrepid, a 557-teraflop IBM Blue Gene/P system. During that time, we captured application-level I/O characterizations from 6,481 unique jobs spanning 38 science and engineering projects with up to 163,840 processes per job. We also captured patterns of I/O activity in over 8 petabytes of block device traffic and summarized the contents of file systems containing over 191 million files. From this collection of data we are able to quantify systemwide trends such as how application behavior changes with job size, the “burstiness” of the storage system, and the change in file system contents over time. We also identify the top ten storage users by application domain and investigate how their I/O strategies relate to I/O performance. One of these applications is then selected as a case study in I/O tuning based on integrated I/O characterization. We then use the results of our study to highlight trends that will affect the design of future storage systems, and we identify opportunities for improvement in I/O characterization methodology.
Publication date: 2010